Genomes Project. In addition to variant annotation with respect to genes, ANNOVAR
has the ability to perform annotation based on genomic region and to compare variants
to existing variation databases. In general, ANNOVAR annotations can be grouped
into three types: (i) gene-based annotation, which identifies the effects of
variants on proteins; (ii) region-based annotation, which identifies the affected region
(e.g., a conserved region); and (iii) filter-based annotation, which identifies variants
reported in a specific database such as dbSNP, ExAC, the 1000 Genomes Project, or gnomAD.
The filter-based annotation may also report prediction scores such as SIFT, PolyPhen, LRT, MutationTaster,
MutationAssessor, FATHMM, MetaSVM, and MetaLR.
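The difference between the three annotation types can be illustrated with a minimal Python sketch. This is a toy model, not ANNOVAR's implementation: the gene, region, and database records, the names GENES, CONSERVED, and KNOWN_VARIANTS, and the functions overlapping and annotate are all hypothetical. The gene-based step here only reports which gene a variant falls in, whereas a real gene-based annotation would also predict the effect on the protein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int  # 1-based position
    ref: str
    alt: str


# Hypothetical toy records: (chromosome, start, end, name), inclusive coordinates.
GENES = [("chr1", 100, 200, "GENE_A")]
CONSERVED = [("chr1", 150, 160, "conserved_1")]

# Hypothetical stand-in for a variation database such as dbSNP:
# an exact (chrom, pos, ref, alt) key maps to a database identifier.
KNOWN_VARIANTS = {("chr1", 155, "A", "G"): "rs_example"}


def overlapping(intervals, chrom, pos):
    """Return the names of all intervals that contain the position."""
    return [
        name
        for c, start, end, name in intervals
        if c == chrom and start <= pos <= end
    ]


def annotate(variant):
    """Apply the three annotation types to one variant."""
    key = (variant.chrom, variant.pos, variant.ref, variant.alt)
    return {
        # (i) gene-based: which gene(s) does the variant fall in?
        "gene": overlapping(GENES, variant.chrom, variant.pos),
        # (ii) region-based: which annotated region(s) does it overlap?
        "region": overlapping(CONSERVED, variant.chrom, variant.pos),
        # (iii) filter-based: is this exact variant present in the database?
        "filter": KNOWN_VARIANTS.get(key),
    }
```

Note that the gene-based and region-based steps match by position only, while the filter-based step requires the exact reference and alternate alleles to match a database entry.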
ANNOVAR uses annotation databases to perform the above types of annotation.
The annotation databases are built from the organism's annotation file in GFF3 format.
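For readers unfamiliar with GFF3, each feature line has nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes), with the attributes given as semicolon-separated key=value pairs. The following minimal sketch parses one such line into a dictionary. The function name parse_gff3_line and the example record are hypothetical, and this is not how ANNOVAR itself reads the file.

```python
from urllib.parse import unquote

GFF3_COLUMNS = 9


def parse_gff3_line(line):
    """Parse one GFF3 feature line; return None for blank or comment lines."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != GFF3_COLUMNS:
        raise ValueError(f"expected {GFF3_COLUMNS} columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols

    # Column 9 holds key=value pairs; special characters are percent-encoded.
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[unquote(key)] = unquote(value)

    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),  # 1-based, inclusive
        "end": int(end),      # 1-based, inclusive
        "score": score,
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }


# Hypothetical example record describing one gene.
example = "chr1\texample\tgene\t100\t200\t.\t+\t.\tID=gene1;Name=GENE_A"
record = parse_gff3_line(example)
```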
FIGURE 4.15 Number of variant effects by type and region.
FIGURE 4.16 Bar chart showing the percentage of variants by region.